Integration of Face and Voice Recognition

Abstract

Cepstral features and features based on a bio-mechanical model of the visible articulators will serve as the identity-carrying characteristics extracted from acoustic speech and visual speech, respectively. Speakers will be modelled by multi-layer perceptrons (MLPs) trained either as discriminative models or as predictive models. In the discriminative scheme, each speaker model will be trained to recognize its allocated speaker directly from his acoustic and visual speech. A measure based on the cross-correlation between the motion of the visible articulators and the acoustic speech will be used to detect impostors whose facial images and voice do not originate from the same person (e.g. an impostor playing a tape-recorded voice while mimicking the movement of the visible articulators). The predictive scheme rests on the belief that acoustic and visual speech are cross-correlated, so that one may be predicted from the other, and on the further assumption that the acoustic-to-visual mapping is speaker-specific. Each speaker will therefore be modelled by an MLP trained to perform the acoustic-to-visual speech mapping (prediction) for his own speech, and the prediction error of each speaker model will act as the recognition measure: the lower the error, the better the model fits the given acoustic and visual speech.

Conclusion: Preliminary investigations have shown that person recognition accuracy can be improved by the joint use of vocal and facial information. It remains to be seen whether acoustic speech used in conjunction with visual speech will also yield improved recognition accuracy.
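The predictive scheme can be illustrated with a minimal sketch. This is not the paper's implementation: the MLPs are replaced here by simple least-squares linear maps, and the speaker names, feature dimensions, and toy data are all hypothetical. The essential idea is preserved: fit one acoustic-to-visual mapping per speaker, then identify a test utterance by the model with the lowest prediction error.

```python
import numpy as np

def fit_speaker_model(acoustic, visual):
    """Fit a linear acoustic-to-visual mapping (stand-in for the MLP)."""
    W, *_ = np.linalg.lstsq(acoustic, visual, rcond=None)
    return W

def prediction_error(W, acoustic, visual):
    """Mean squared error of the predicted visual features."""
    return float(np.mean((acoustic @ W - visual) ** 2))

def identify(models, acoustic, visual):
    """Return the speaker whose model best predicts the visual stream."""
    errors = {spk: prediction_error(W, acoustic, visual)
              for spk, W in models.items()}
    return min(errors, key=errors.get)

# Toy data: two speakers, each with a different acoustic-to-visual mapping.
rng = np.random.default_rng(0)
A1, A2 = rng.normal(size=(100, 4)), rng.normal(size=(100, 4))
W_true1, W_true2 = rng.normal(size=(4, 2)), rng.normal(size=(4, 2))
models = {
    "speaker_a": fit_speaker_model(A1, A1 @ W_true1),
    "speaker_b": fit_speaker_model(A2, A2 @ W_true2),
}

# A test utterance generated by speaker_a's mapping is attributed to speaker_a.
test_A = rng.normal(size=(20, 4))
print(identify(models, test_A, test_A @ W_true1))  # -> speaker_a
```

Because the recognition measure is a per-model error rather than a single classifier's output, adding a new speaker only requires fitting one new model, which is one practical attraction of the predictive scheme.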


Similar Articles

Benefits for Voice Learning Caused by Concurrent Faces Develop over Time.

Recognition of personally familiar voices benefits from the concurrent presentation of the corresponding speakers' faces. This effect of audiovisual integration is most pronounced for voices combined with dynamic articulating faces. However, it is unclear whether learning unfamiliar voices also benefits from audiovisual face-voice integration or, alternatively, is hampered by attentional capture of ...


Implementation of Face Recognition Algorithm on Fields Programmable Gate Array Card

The evolution of today's application technologies requires a certain level of robustness, reliability and ease of integration. We chose a Field Programmable Gate Array (FPGA), programmed in a hardware description language, to implement a facial recognition algorithm based on "eigenfaces" using Principal Component Analysis (PCA). In this paper, we first present an overview of the PCA used for facial recognition,...
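The PCA stage of the eigenfaces recipe can be sketched in a few lines of NumPy. This is only the algorithmic core, not the FPGA implementation the snippet describes, and the toy image dimensions are made up for illustration.

```python
import numpy as np

def eigenfaces(images, n_components):
    """images: (n_samples, n_pixels) matrix of flattened face images.
    Returns the mean face and the top principal components ("eigenfaces")."""
    mean = images.mean(axis=0)
    centered = images - mean
    # SVD of the centered data yields the principal axes directly.
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    return mean, Vt[:n_components]

def project(image, mean, components):
    """Represent a face by its coordinates in the eigenface subspace."""
    return components @ (image - mean)

rng = np.random.default_rng(1)
faces = rng.normal(size=(50, 64))        # 50 toy "images" of 64 pixels each
mean, comps = eigenfaces(faces, n_components=8)
weights = project(faces[0], mean, comps)  # 8-dimensional face descriptor
```

Recognition then reduces to nearest-neighbour comparison of these low-dimensional weight vectors, which is what makes the method attractive for constrained hardware such as an FPGA.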


Voice-based Age and Gender Recognition using Training Generative Sparse Model

Abstract: Gender recognition and age detection are important problems in telephone speech processing for investigating the identity of an individual using voice characteristics. In this paper, a new gender and age recognition system is introduced, based on generative incoherent models learned using sparse non-negative matrix factorization and an atom-correction post-processing method. Similar to genera...


Effect of Sensor Fusion for Recognition of Emotional States Using Voice, Face Image and Thermal Image of Face

A new integration method is presented to recognize the emotional expressions of humans. We attempt to use both voices and facial expressions. For voices, we use prosodic parameters such as pitch signals, energy, and their derivatives, which are modelled by a Hidden Markov Model (HMM) for recognition. For facial expressions, we use feature parameters from thermal images in addition to visible images...


Functional Connectivity between Face-Movement and Speech-Intelligibility Areas during Auditory-Only Speech Perception

It has been proposed that internal simulation of the talking face of visually-known speakers facilitates auditory speech recognition. One prediction of this view is that brain areas involved in auditory-only speech comprehension interact with visual face-movement sensitive areas, even under auditory-only listening conditions. Here, we test this hypothesis using connectivity analyses of function...


Familiar face and voice matching and recognition in children with autism.

Relatively able children with autism were compared with age- and language-matched controls on assessments of (1) familiar voice-face identity matching, (2) familiar face recognition, and (3) familiar voice recognition. The faces and voices of individuals at the children's schools were used as stimuli. The experimental group were impaired relative to the controls on all three tasks. Face recogni...



Journal:

Volume   Issue 

Pages  -

Publication date: 1993